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Abstract 

We describe a simple and efficient algorithm for two- 
view triangulation of 3D points from approximate 2D 
matches based on minimizing the L 2 reprojection error. 
Our iterative algorithm improves on the one by Kanatani 
et al. [5] by ensuring that in each iteration the epipolar 
constraint is satisfied. In the case where the two cameras 
are pointed in the same direction, the method provably con- 
verges to an optimal solution in exactly two iterations. For 
more general camera poses, two iterations are sufficient to 
achieve convergence to machine precision, which we exploit 
to devise a fast, non-iterative method. The resulting algo- 
rithm amounts to little more than solving a quadratic equa- 
tion, and involves a fixed, small number of simple matrix- 
vector operations and no conditional branches. We demon- 
strate that the method computes solutions that agree to very 
high precision with those of Hartley and Sturm’s original 
polynomial method [2], though achieves higher numerical 
stability and 1-4 orders of magnitude greater speed. 

1. Introduction 

Triangulation is one of the most fundamental problems 
in computer vision. The problem can be stated as follows: 
Given a 3D point X projected to Xi = PiX in two or more 
cameras, recover the 3D position of X from its 2D projec- 
tions. When X is consistent with the matched points x % , 
this is a trivial linear problem. In practice, however, the 
measured and reprojected points do not exactly coincide, 
which causes the rays from the camera centers through the 
imaged points not to intersect in 3D. This may be due to un- 
certainties in relative camera poses or intrinsics (i.e. errors 
in Pfi, or to the inherent difficulties in designing automated 
methods that perfectly match points to within subpixel ac- 
curacy across images (i.e. errors in xfi. In the presence of 
noise, the triangulation problem becomes one of finding the 
3D point that best describes the observed image points. 

Several criteria and error functionals for triangulation 
have been proposed in the literature [1, 3, 9, 10]. Under the 
assumption that the imaged points are perturbed by Gaus- 


sian noise, the optimal, maximum likelihood solution [2] 
minimizes the L 2 reprojection error 

d(x, x) 2 + d(x', x') 2 (1) 

where d is the Euclidean image-plane distance, x and x' are 
the observed points, and x and x' are corrected points that 
satisfy the epipolar constraint [7] 

x T Fx' = 0 (2) 

Here F denotes the rank-2 fundamental matrix defined for a 
pair of cameras. When satisfied, the epipolar constraint im- 
plies that the corresponding rays intersect at a point in 3D. 
A globally optimal solution to this non-convex constrained 
optimization problem is due to Hartley and Sturm [2] (of- 
ten referred to as optimal triangulation or the polynomial 
method ), and amounts to Ending all roots of a degree- six 
polynomial. Though theoretically optimal, the task of re- 
liably Ending polynomial roots using finite precision arith- 
metic is nontrivial. Moreover, the relative difficulty of im- 
plementing this method and its computational cost can be 
considerable. Henceforth we will refer to this method as 
hs. 

Iterative schemes are possible alternatives to Hartley 
and Sturm’s “direct” method. The bundle adjustment ap- 
proach [12] optimizes the position of the 3D point explicitly 
(and possibly the camera parameters as well), and thus en- 
sures that the epipolar constraint is satisfied. Its main draw- 
back is a reliance on a good initial guess of the 3D posi- 
tion of the point, and even with reasonable estimates such a 
method often fails [8]. On the other hand, iteration over the 
positions of the 2D points (x,x r ) has the benefit of a good 
initializer (x, x'), and here the main concern becomes one 
of ensuring that the epipolar constraint is met upon conver- 
gence. Kanazawa and Kanatani [6] proposed such a method 
that converges quickly but generally to a solution that does 
not satisfy Eq. (2). Kanatani et al. [5] more recently pre- 
sented an improved higher-order method that does satisfy 
the epipolar constraint, and which converges to a local ex- 
tremum of the reprojection error. Like [6], Kanatani et alls 
method, which we will refer to as ksn, solves in each it- 
eration a linear equation, but improves on [6] by incorpo- 
rating information from the previous iteration. Aside from 


no guarantees on global optimality, the chief drawback of 
this method is the number of iterations required — especially 
in unstable camera configurations — and the need to test for 
convergence. Moreover, early termination of this iterative 
scheme is not possible as the epipolar constraint is gener- 
ally not satisfied until convergence. 

In this paper we propose a new image-space iterative 
method for two-view triangulation that takes optimal steps 
(Ac*;, Ax' k ) in each iteration k by solving a quadratic rather 
than linear equation. In particular, each intermediate iterate 
(xk, x' k ) is guaranteed to satisfy the epipolar constraint, and 
hence forms a valid (though possibly suboptimal) solution, 
allowing for early termination. We show that as Kanatani 
et al.’s higher-order method converges, its steps approach 
those computed by our method. We further prove that when 
the two cameras’ principal axes are parallel (i.e. the cam- 
eras face in the same direction), the method converges in 
exactly two iterations. This is an important use case that 
occurs, for instance, in (zero toe in) stereo photography and 
structure from motion from a single forward-facing camera. 
In practice, convergence to very high precision (more than 
10 digits) is achieved in at most two iterations for more gen- 
eral camera poses as well; a result that we use to simplify 
the second iteration for a fully non-iterative method. In ad- 
dition to being very simple and efficient, we show using 
extensive experiments that the solutions delivered by our 
method are in excellent agreement with those of Hartley 
and Sturm’s, but at only a tiny fraction of computational 
cost. In fact, due to the polynomial method’s susceptibil- 
ity to ill-conditioning, our method often yields lower repro- 
jection errors in practice. Thus, although not theoretically 
optimal, our method is a practical and sometimes preferred 
alternative to the polynomial method. 

We begin by presenting an iterative version of our algo- 
rithm and discuss its similarities with the one by Kanatani 
et al. We later prove that when the cameras point in the 
same direction two iterations suffice, and that the solution 
is indeed an extremum of the reprojection error. We then 
rewrite parts of the algorithm to make it more efficient, and 
also discuss how to make it numerically robust. We con- 
clude with experimental results and a discussion of future 
work. 

2. Preliminaries 

For simplicity of presentation, we will assume that the 
cameras are calibrated and that the point matches (x. x') 
are normalized, i.e. they have been premultiplied by the in- 
verse camera calibration matrix K~ l . This allows us to 
work with essential matrices E = [t] x R, where R and t are 
the relative orientation and translation of the two cameras, 
and where [-] x represents the cross product operator (see 
also [3]). The uncalibrated case can be treated similarly, 
with F replacing E, as long as pixels are square; a prereq- 


uisite for the Euclidean reprojection error to be meaningful. 
We represent points in the image plane using homogeneous 

coordinates, e.g. x = (xi x 2 1) . Using this notation, 
the epipolar constraint becomes 

x T Ex' = 0 (3) 

Vectors in the image plane have only two components, and 
to combine 2-component vectors and homogeneous points 
we define the matrix 

a A o o\ 

5 =(o 1 oj (4) 

Thus S T Ax are the 3D coordinates of the 2D displacement 
Ax. We will make frequent use of 

E = SES t (5) 

which is the upper left 2x2 submatrix of E. 

The epipoles e = t and e! — R J t are the projections of 
the camera centers c and c', and are the left and right null 
vectors of E, respectively. The line through e! and an image 
point x' in the second camera projects to an epipolar line 
£ = Ex' in the first camera. Thus a point x on £ satisfies 
£ J x = 0. Similarly, i' = E J x. We use ( x,x' ) to denote 
the measured points, and (x,x r ) the corrected points. 

3. Optimality conditions 

Optimal triangulation, as formulated by Hartley and 
Sturm [2], can be expressed as the following minimization 
problem: 

minimize Ax T Ax + Ax' 1 Ax' 

subject to (x — S t Ax) t E{x' — S T Ax') = 0 

with Ax = S(x — x), Ax' = S(x' — x'). Equation (6) 
is a quadratically constrained quadratic optimization prob- 
lem. We introduce a Lagrange multiplier — 2 A for the scalar 
constraint, resulting in the Lagrangian 

f(Ax , Ax', A) = Ax t Ax + Ax' J Ax' 

-2X(x- S 1 Ax) 1 E(x' - S T Ax') (7) 

Setting the gradient of / to zero, we obtain the following 
quadratic equations: 

x J Ex' = {x- S t Ax) t E(x' - S T Ax') = 0 (8) 

Ax = XSE(x' - S T Ax') = XSEx' = An (9) 

Ax' = X SE J (x - S J Ax) = X SE J x = An' (10) 

These are one scalar and two vector equations for a total of 
five constraints involving five unknowns (Ax, Ax', and A). 


Equation (8) implies that x and x' must lie on correspond- 
ing epipolar lines £ = Ex' and £' = E J x. Equations (9) 
and (10), on the other hand, constrain x and x' to lie on 
lines orthogonal to £ and £' that pass through x and x', re- 
spectively (see Fig. 1). In other words, Ax and Ax’ are 
parallel to the normal vectors n and n' of the epipolar lines. 

The geometric consequences of these line intersection 
constraints are well known: Zxxe must be a right angle, 
and hence the optimum in the first camera must lie some- 
where on the smallest circle in the image plane that contains 
x and the epipole e (and similarly for the second camera). 
The approach taken by Hartley and Sturm is to parameterize 
this circle in the first camera via a stereographic projection 
from the epipole e onto the line (0 0 1) T x (e — x) (the 

vertical line through x = Xq in Fig. 1). Each point on this 
line can then be expressed using a single parameter t, and 
can later be mapped back to the corresponding point on the 
circle. 

In addition to the directional constraints of Eqs. (9) 
and (10), these equations impose a magnitude constraint on 
an optimal solution, i.e. that the Lagrange multiplier A take 
on the same value in both equations. We can thus express 
Eqs. (8) to (10) intuitively as a set of conditions that any 
optimal solution (x, x') must satisfy: 

(i) x and x! lie on corresponding epipolar lines. 

(ii) x and x' are the projections of x and x' onto these 
epipolar lines. 

(iii) The corrections Ax and Ax' are linearly related by a 
single parameter A. 

Note that these three requirements are necessary for a pair 
(x,x') to satisfy the epipolar constraint and be an extremum 
of the reprojection functional /. They are, in general, not 
sufficient for global optimality. As Hartley and Sturm point 
out, as many as three minima and three maxima may ex- 
ist. (Since / is parameterized over a circle, it is periodic, 
and hence there are as many minima as there are max- 
ima.) In practice, however, / has with very high probability 
(> 99.6% in our experiments) a single minimum, and hence 
our primary goal is to satisfy the three constraints above. 

4. Iterative algorithm 

Our iterative approach to satisfying the optimality con- 
straints is to linearize Eqs. (9) and (10) by replacing on the 
right hand side the optimal (x, x') points with the current 
best estimate (xk,x' k ). The measured points x — xq and 
x' = x' 0 serve as initial estimates. Substituting the cur- 
rent estimates Axk and Ax' k into Eq. (8) results in a single 
quadratic equation in the unknown A k : 

A lnlEn' k - \ k {n[n k + n'-J n' k ) + XqEx' q = 0 (11) 



Figure 1: Iterative refinement of x. The optimum, £ 2 , is 
the orthogonal projection of the measured point xq onto the 
epipolar line e = Ex[ = Ex' 2 , and lies on the smallest 
circle through xq and the epipole e. 

with n k = SEx' k , n' k = SE J x k . Of the two possible roots, 
we choose the smaller one (in magnitude) as A governs how 
far to step from the measured points. Given this value of A 
we update the displacements Ax and Ax' using Eqs. (9) 
and (10), and consequently the positions (x,x'), and con- 
tinue with the next iteration. The resulting algorithm is pre- 
sented in Listing 2. The issue of convergence is not ad- 
dressed here, and will be discussed in more detail below. 

Note that by solving Eq. (8), we explicitly enforce the 
epipolar constraint in each iteration. Moreover, the dis- 
placements Ax and Ax' are based on a single step size, A, 
and hence two of the three optimality conditions are met. 
This is in contrast to the method by Kanatani et al. [5], 
which enforces the epipolar constraint only upon conver- 
gence. In fact, the only difference between our iterative 
method and Kanatani’s is the step size A. The two meth- 
ods employ the same step directions n k and n' k (as does the 
original method [6]), i.e. from the measured points (xo, x' 0 ) 
in a direction orthogonal to the current epipolar lines asso- 
ciated with (x k ,x' k ). These directions are in some sense 
the best possible choice, as upon convergence they point to 
the optimum. By also satisfying the epipolar constraint, we 
additionally take the best possible length step. 

Whereas our step size X k is the root of a quadratic equa- 
tion, one can show that the X k computed by Kanatani’s 
method is a solution to 

A ln k En' k - X k (njn k + n[ T n' k ) + XqEx ' 0 = 

(Ax k - Ax k -i) J E(Ax' k - A4_i) (12) 

Since Ax k = A k n k and Ax' k = X k n' k , the quadratic term 
X k cancels, resulting in a linear equation in X k (see List- 
ing 1). As Kanatani’s method converges, the right-hand- 
side of Eq. (12) approaches zero, and hence the X k com- 


ksn (x 0 ,x' 0 ,E) 

1. Ax o <— 0 

2. Ax o «— 0 

3. for k = 1, . . 

4. 

5. 

6 . 

7 . Axfc 4 XfcTik 

8. 4 - A fc n' fe 

9. 4 - a; 0 - 

10. zcj. 4 - x' 0 — S J Ax' k 

Listing 1: The higher-order method by Kanatani et al. [5]. 
Differences with respect to Listing 2 are highlighted. 


rik <— SEx k _ 1 
n' k 4 — SE J x k -i 

\ x 0 Ex 0 — Ax k _ 1 EAx k _ 1 

k n T k n k +n' k T n' k 


iter (x 0 ,x' 0 ,E) 

1. for k = 1, . . . 

2. n k 4 — SEx' k _ 1 

3. n' fc 4 — SE J x k - i 


4. 

cik 4 - 

~ n lEn' k 

5. 

b k 

- \(ri[n k + n'± ri k ) 

6 . 

Ck <- 

^ O 

+ 0 

1 

7. 

d k 

- ^]b 2 k - a k c k 

8 . 

A k 

C-k 

b k +sgn(b k )d k 


9. Ax k 4 X k 7i k 

10. ArJ. 4 - A fc n' fc 

11. x k _ 

12. x' k <— x' 0 - S T Ax' k 

Listing 2: Our iterative method. 


puted by Kanatani’s method approaches the quadratic solu- 
tion obtained directly by our method (c.f. Eq. (11)). 

We remark that our implementation uses a careful choice 
in the computation of A. Since we are interested only in the 
smaller of the two roots, we compute A = b+S g n ^ d instead 

of the equivalent A = b ^ s ^d b ) d t0 guarantee effective ad- 
dition instead of subtraction. This avoids the potential for 
catastrophic cancellation (c.f. [11, §5.6]). 

When the error functional has only one minimum, as we 
found to be the case in more than 99.6% of our experiments, 
our iterative method quickly converges to that minimum. 
Choosing the larger of the two roots for A in each iteration 
takes us instead to the single maximum. 

Once the corrected points (x, x') have been computed, 
any method can be used to infer the 3D point X, as the two 
rays must intersect. When pose and intrinsics are known, 
the method outlined in [ 6 ] may be used: 


z = x x Rx' 


z 1 Ex' A 
X = — = — x 


Z J Z 


(13) 


4.1. Convergence criterion 

To assess convergence, it is important to ask what we 
wish to converge to. A “small enough” change in (xk,x' k ), 
though normally an indicator of convergence, does not im- 
ply that the optimality conditions are met. Since in our 
method and as well [5, 6 ] condition (iii) — single A — is guar- 
anteed, only the remaining two need examination. That 
is, the solution (x, x') must lie at the intersection of corre- 
sponding epipolar lines and on the orthogonal lines through 
(x,x / ). In other words, ( x,x' ) must be the projection of 
(x,x') onto the epipolar lines defined by (x : x r ). As a 
general convergence criterion, we measure the distance of 
(x : x r ) to this intersection, which accounts for both epipo- 
lar constraint violation (by methods like ksn, for instance) 
and optimality along the epipolar lines. 


5. Non-iterative algorithm 


The advantage of our method over Kanatani’s, aside 
from faster convergence, is that any intermediate iterate 
satisfies the epipolar constraint and therefore constitutes a 
valid (albeit possibly suboptimal) solution, thus allowing 
for execution of only a small, fixed number of iterations. 
In fact, a remarkable result is that our iterative algorithm 
can be shown to attain a minimum in exactly two itera- 
tions when the cameras face in the same direction, regard- 
less of translation and the remaining rotational degree of 
freedom (see Appendix A). In this case the first iteration 
finds the correct epipolar lines, and the second iteration sim- 
ply moves the points along these lines to the projections 
of (x,x'). For more general camera poses, we observed 
that our method, with extremely high likelihood, converges 
in two iterations to 12 digits of precision or more. This 
suggests the possibility of running only the first iteration to 
identify the epipolar lines, followed by a projection step. 

In the general setting, this projection step may violate 
condition (iii), since then the resulting step sizes in the two 


views may require different values 


Axjri2 i Ax'f n'r 


n1n 2 


and 


A. An alternative solution is to fix A by computing 


n'fn'z 


for 


Ao — 


Axjri2 + Ax\ J n'r 
njn 2 + n'2 n 2 


— Ai 


2di 


nJ ri2 + n'o " n 


/ T 


(14) 


This is equivalent to applying one step of iter followed by 
one step of ksn. In the general case this linearization of A 2 
comes at the expense of violating the epipolar constraint (i). 
As we shall see, the discrepancy is usually on the order of 
10 -15 , and is for all practical intents inconsequential. 

Our non-iterative algorithm, implemented both ways, is 
presented in Listing 3. A C++ implementation of the faster 
niter 2 version compiles to 36 additions/subtractions, 
49 multiplications, 2 divisions, and 1 square root, for a total 
of 88 floating-point scalar arithmetic operations. 



niter 1 (x,x', E ) 

1. n <— SEx' 

2. n' <- 

3 . a <— n T Err! 

4. b ^(n T n + n /T n / ) 

5. c <— x T Ex' 

6 . d <— \Jb 2 — ac 

8. Z\x An 

9. Ax' An' 

10. Tl < — Tl — EAx' 

11. n' n' — E T Ax 

12. Ax 


Ax J n 


-n 


Ax ,J n' / 
/ t 7 n 


13. Ax' <- 

w 1 w 

14. x <— x — S T Ax 

15. x' x' — S J Ax' 


niter2 (x, x' , E) 

1. n <— AfAc' 

2. n' <- SE T x 

3. a <— n T En' 

4. b ^(n T n + n' T n') 

5. c <— x T Ex' 

6. d <— \Jb 2 — ac 

8. Ax <— An 

9. Z\x' An' 

10. n < — n — EAx' 

11. n' n' — E T Ax 

12. A «— A T 2d /T 7 

13. An 

14. Ax' An' 

15 • 0C ^ *2^ 

16. x' a/ — 5 T Ar' 


Listing 3: Our two non-iterative methods with differences 
highlighted, niterl guarantees conditions (i) and (ii), 
whereas niter 2 guarantees condition (iii). 


6. Results 

We evaluated our method on both synthetic and real 
data and made comparisons with our C++ implementations 
of Hartley and Sturm’s method [2] (hs) and the one by 
Kanatani et al. [5] (ksn). Our data sets exhibit a wide va- 
riety of more than 100,000 relative camera poses. Statistics 
on these data sets are reported in Table 1. 

6.1. Synthetic data 

We begin by examining how our method performs on 
the synthetic data. Each of these data sets is comprised 
of 10 x 10 random point clouds of 10,000 points each, 
projected onto images of size 1,024 2 pixels, with a focal 
length of 512 pixels. These point clouds have a Gaussian 
radial distribution with a standard deviation of \ relative 
to the baseline. Each member of this 10 x 10 ensemble 
corresponds to a certain image-space Gaussian noise level 
a = 2 n in pixels, with n = —5, —4, . . . , +4, and distance 
from center of baseline S = 2 d for d = — 1 , 0 ,..., +8. 
In “orbital” the cameras point at the point cloud centroid; 
in “lateral” the cameras point in the same direction and 
are translated laterally (a stable configuration simulating a 
stereo camera); while in “forward” the cameras also point in 
the same direction but one is translated directly forward (an 
unstable configuration simulating a forward-facing camera 
attached to a vehicle or robot). 

Table 1 lists the number of real polynomial roots found 
by hs. For root Ending we first implemented the popu- 
lar eigenvalue method [11]. In spite of the polynomial be- 



Orbital Lateral Forward 

Corridor 

Dinosaur Notre Dame 

Cameras 

20 

2 

2 

11 

36 

715 

Poses 

10 

1 

1 

55 

322 

118,467 

Points 

10 6 

10 6 

10 6 

737 

4,983 

127,431 

Matches 

10 6 

10 6 

10 6 

12,003 

25,841 

4,579,786 

1 root 

2 roots 

CD 

o 

10 G 

10 6 

12,003 

25,841 

4,553,033 

4 roots 






26,753 

Agreement 

8 

16 

8 

6 

7 

3 


Table 1: Data sets used. “Poses” denotes the number of 
camera pairs in which at least one point is visible. A point 
may partake in multiple pairwise “matches.” “Roots” refers 
to the number found in the C++ implementation of hs. 
“Agreement” is the minimum number of digits to which the 
hs and niterl reprojection errors agree. 


ing of low degree, we found this method unreliable due to 
poor conditioning. With relative coefficient magnitudes of- 
ten spanning more than the 16 digits of precision available 
in a double, we observed catastrophic loss of precision in 
the eigensolver. Instead we used Bond’s implementation of 
the well-known Jenkins-Traub root Ending method [4] from 

http : / /www . crbond . com/ download/mi sc/rpoly . cpp. In 

addition to complexity of implementation, the need for a so- 
phisticated root finding technique in hs are two arguments 
in favor of our much simpler method. 

Returning to Table 1, with the epipoles at infinity in the 
“lateral” case the polynomial is reduced to linear, and hence 
only one root exists. The other two synthetic data sets result 
in exactly two roots in all two million cases. This table also 
lists the degree of precision at which the resulting reprojec- 
tion error agreed between hs and our non-iterative method 
niterl. This agreement is always at least three digits, and 
in over 99.99% of all cases six digits or more. For each of 
these three data sets, our non-iterative method converged to 
at least twelve digits of precision in two iterations or less. 

Histograms of relative reprojection errors between hs 
and our iterative and non-iterative methods are shown in 
Fig. 3. We compute the relative error as AE — mi n{~ "ee '} ’ 
where E and E' are the reprojection errors found by one 
of our methods and hs, respectively. Thus AE — ±10“ 15 
means that our method agreed with hs to 15 digits of preci- 
sion. The histogram bins span ±[10', 10 z+1 ); blue columns 
show cases where our method found a better solution, green 
indicates the range (— 10 -17 ,+10 -17 ) around machine ep- 
silon, while red columns indicate that hs did better. 

Evidently our method compares well with hs in 
practice — in spite of hs being globally optimal in theory — 
with relative errors more or less exhibiting a normal distri- 
bution around zero, and with our three methods producing 
qualitatively similar histograms. In the worst case, niterl 
exceeded the reprojection error reported by hs by one part 
in 10 8 , while ensuring that the epipolar constraint was met. 


6.2. Real data 

We also evaluated our method on the multi-view data sets 
“dinosaur” and “corridor” from http : / / www . robots . ox . 
ac.uk/~vgg/data/data-mview.html, and “Notre Dame” 
from http : / /phototour . cs . Washington . edu/ data sets/. 

Exactly two roots were found for all point matches in the 
first two data sets, while hs sometimes reported four roots 
for the third. Indeed, this is the only data set for which we 
found more than two roots (but not more than four). We sep- 
arated out these cases to test whether our method would get 
stuck in a local minimum. The dark columns in the bottom 
row of Fig. 3 suggest no particular problems for any of our 
methods in these situations. We also measured the square 
distance of the niter 2 solutions to the epipolar lines, and 
found on both the synthetic and real data this distance to be 
at most 10 -9 (in normalized coordinates); in 99.999% of all 
cases this discrepancy was less than 10 -15 . 

6.3. Convergence 

To demonstrate the superiority in convergence of our 
method over ksn, we counted the number of iterations re- 
quired to converge to 12 digits of precision for the “for- 
ward” unstable camera pose, where the imaged points are 
close to the epipoles. Our method is guaranteed to converge 
in (at most) two iterations regardless of noise and scene dis- 
tance, as was also verified experimentally. Fig. 2 plots the 
average number of iterations required by ksn for different 
levels of noise and distance. As expected, the number of it- 
erations increases both with noise level and distance. In the 
worst case, ksn required 11 iterations to converge. 

6.4. Timings 

Using the synthetic data we evaluated the speed of our 
niter 2 method relative to hs and ksn on a dual 3.2 GHz 
Intel Xeon PC. In Table 2 we also include the results of 
two simple methods that do not minimize reprojection error: 
the midpoint (mid) and linear homogeneous method (dlt); 
see [3]. We note that the timings for both of these meth- 
ods, which compute 3D points directly, include the time to 
project the solutions back to 2D. Nevertheless, even when 
measuring 3D extraction, our method is more than twice 
as fast as mid, which might be considered the simplest tri- 
angulation method known. We also note that on the “or- 
bital” and “forward” data sets our method is 50 times faster 
than Hartley and Sturm’s, and about 20 times faster than 
Kanatani’s own publically available C++ implementation of 
ksn. In case of the “lateral” data set, the Jenkins-Traub root 
finder used by hs spent considerably more time isolating 
the single root, and here our method was 8,000 times faster. 


o = 16 o = 4 — o=l — o = 0.25 



scene distance relative to baseline 


Figure 2: Average number of iterations required by ksn to 
converge as a function of scene distance and noise level a. 



hs 

dlt 

ksn 

mid 

niter2 

Points/sec 

130 K 

140 K 

1.7 M 

1.7 M 

6.5 M 

Speedup 

1.0 

1.1 

13 

13 

50 


Table 2: Speed of computing optimal image points (x,x'). 

7. Conclusions 

We presented a fast iterative method for uncalibrated 
two-view triangulation that takes optimal step sizes so as to 
enforce the epipolar constraint in each iteration. In practice, 
the optimal epipolar lines are found in the first iteration, al- 
lowing the second iteration to be cast as a simple projection 
step that renders the method non-iterative. The problem 
case of unstable camera configurations — where triangula- 
tion methods exhibit their greatest variation in quality and 
speed — is chiefly where our method excels over the one by 
Kanatani et al. [5]. Our simple method is furthermore non- 
iterative and several times faster than theirs. In spite of no 
theoretical optimality guarantees in the general setting, we 
showed our method to generally be more reliable and ac- 
curate in practice than Hartley and Sturm’s, in addition to 
being fifty times faster. 

Future work will investigate the convergence properties 
of our new method in order to gain an understanding of 
whether it may fail, and why the method converges to such 
great precision in only two iterations. The problem treated 
here is limited to two views. We envision possible exten- 
sions to three-view triangulation, which likely will involve 
the solution of cubic equations. 
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Figure 3: Histograms of reprojection error relative to Hartley and Sturm’s method for our iterative and two non-iterative 
methods. Due to numerical issues, our method often finds better solutions (blue columns). Note that the vertical axis for 
Notre Dame is logarithmic to highlight (using darker columns) the cases where hs found four roots and where multiple 
minima may exist. 
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A. Proof of optimality 


We here prove that our method is optimal when the two 
cameras point in the same direction (that is, when their prin- 
cipal axes are parallel). Let R and t be the relative orien- 
tation and translation of the two cameras, and let E be the 
associated essential matrix, with 


R = 


'Q 0 N 

0 1 


t= (x y z) 
E = SES J = 


0 

z 


—2 

0 


Q 


(15) 

(16) 

(17) 


for any orthogonal 2x2 matrix Q and unit vector t. Thus, 


the cameras both point in the direction (0 0 1) 

may easily verify that 

E J E = EE J = z 2 I 

ES t SE t S t SE = z 2 E 

We have in the first iteration 

ai = nj Erii = x' 0 T E J S T SES J SE J x 0 

= x^ES t SE t S t SEx' 0 = z 2 XqEx' 0 


= Z 2 C\ 


Cl 


+ A fn' 1 ‘E , EE T n 1 


2 W?E t > 

— d\ — 2z 2 b\\i T z 2 cl\ A^ 

= a\ + z (a i\ 1 — 2bi\i) = a\ — z c\ 
= 0 


One 

(18) 

(19) 


cli Ai — 2 b\ — — — — 2 b\ — ( b\ T cL \ ) — b\ — d\ 
Ai 

In the second iteration, c 2 = c\ and 
a ,2 = rJ 2 Eri 2 = {n\ — \\ £'n / 1 ) T £'(n / 1 — XiE T ni) 
= njErii — XitiJEE 7 n\ — Xin'^ E J En[ 


( 20 ) 


( 21 ) 


( 22 ) 


b 2 = | {n~[ri 2 + n[ T n 2 ) 

= \{n[ (rii — Ai Erii) + n / 1 T (n / 1 — X\E J ni)) 


= ^(njni — aiAi + tl 1 tl 1 — 01A1) 

= b\ — aiAi = b\ — (61 — di) 

= di 

From Listing 2 and 02 = 0, we obtain a linear expression 

c 2 ci 


(23) 


Ao — 


2b , z 


2di 


(24) 


For the method to converge in the second iteration, we need 
Ax 2 to be the projection of Ax 1 onto ri 2 (see Fig. 1), and 
similarly in the second camera. In other words, we need to 
show that 


Axjri 2 Ax'y n' 2 Axjri 2 + Ax'-J n' 2 

rrR m / T / T 1 / T / 

^2 n 2 n 2 n 2 n 2 n 2 + n 2 n 2 


We limit the proof to the first camera, as the same holds in 
the second camera due to symmetry: 

Axjn 2 njri 2 X\{n[ni — a\Xi) 

ri 2 n 2 ri 2 n 2 njni — 2aiX\ + z 2 X\n' l T n' 1 

Xi(njni - bi + di) 
njm - 2(6i - di) + £^n;V 

Ai(6i+di)(rf[ni-bi-Mi) 

(bi+di)(n[n 1 —2(b 1 —di)) + (b 1 —di)(2bi—ri[ni) 


Ciinjni - bi + di) 
2dinjni — 2b\d\ + 2d\ 
ci(njni - bi + d\) 

2di(njni — b\ + d\) 

°i _ c 2 _ 

2di ~ 2b 2 ~ 2 


(26) 
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